Abstract

The structure of large networks can be revealed by partitioning them
into smaller parts, which are easier to handle. One such decomposition
is based on k-cores, proposed in 1983 by Seidman. In this paper we
present an efficient, $O(m)$, algorithm for determining the cores
decomposition of a given network, where m is the number of lines.

1 Introduction

"One of the major concerns of social network analysis is identification of
cohesive subgroups of actors within a network. Cohesive subgroups are
subsets of actors among whom there are relatively strong, direct, intense,
frequent, or positive ties" ([7], p. 249). Several notions were introduced
to formally describe cohesive groups: cliques, n-cliques, n-clans, n-clubs,
k-plexes, k-cores, lambda sets, ... For most of them it turns out that they
are algorithmically difficult (NP-hard [4] or at least quadratic), but for
cores a very efficient algorithm exists. We describe it in detail in this
paper.



2 Cores 

The notion of core was introduced by Seidman in 1983 [5].

Let G = (V, L) be a graph. V is the set of vertices and L is the set of
lines (edges or arcs). We will denote $n = |V|$ and $m = |L|$. A subgraph
$H = (W, L|W)$ induced by the set W is a k-core or a core of order k iff
$\forall v \in W : \deg_H(v) \geq k$, and H is a maximum subgraph with this
property. The core of maximum order is also called the main core. The core
number of vertex v is the highest order of a core that contains this vertex.

The degree deg(v) can be: in-degree, out-degree, in-degree + out-degree,
... determining different types of cores.

Figure 1: 0, 1, 2 and 3 core

In Figure 1 an example of cores decomposition of a given graph is
presented. From this figure we can see the following properties of cores:

• The cores are nested: $i < j \implies H_j \subseteq H_i$

• Cores are not necessarily connected subgraphs. 

3 Algorithm 

Our algorithm for determining the cores hierarchy is based on the following
property:

If from a given graph G = (V, L) we recursively delete all vertices of
degree less than k, and the lines incident with them, the remaining
graph is the k-core.
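
The property above can be illustrated with a minimal Python sketch. It is not the algorithm of this paper, only a direct reading of the recursive deletion for one fixed k. The function name `k_core` and the dictionary-of-sets graph representation are assumptions made for this illustration.

```python
def k_core(adj, k):
    """Return the vertex set of the k-core of an undirected graph.

    adj maps each vertex to the set of its neighbors (a hypothetical
    representation chosen for this sketch).
    """
    alive = set(adj)
    deg = {v: len(adj[v]) for v in adj}
    # Start with every vertex whose degree is already below k.
    stack = [v for v in adj if deg[v] < k]
    while stack:
        v = stack.pop()
        if v not in alive:
            continue  # v was pushed more than once and is already deleted
        alive.remove(v)
        # Deleting v lowers the degree of each surviving neighbor.
        for u in adj[v]:
            if u in alive:
                deg[u] -= 1
                if deg[u] < k:
                    stack.append(u)
    return alive


# A triangle 1-2-3, a pendant vertex 4 attached to 3, and an isolated vertex 5.
example = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}, 5: set()}
```

On this small graph the 2-core is the triangle, and the returned sets for increasing k are nested, as stated in section 2.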

The outline of the algorithm is as follows: 

INPUT: graph G = (V, L) represented by lists of neighbors
OUTPUT: table core with core number for each vertex

1.1      compute the degrees of vertices;
1.2      order the set of vertices V in increasing order of their degrees;
2        for each v ∈ V in the order do begin
2.1          core[v] := degree[v];
2.2          for each u ∈ Neighbors(v) do
2.2.1            if degree[u] > degree[v] then begin
2.2.1.1              degree[u] := degree[u] - 1;
2.2.1.2              reorder V accordingly
                 end
         end;

In the refinements of the algorithm we have to provide efficient implemen- 
tations of steps 1.2 and 2.2.1.2. 
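
Before the refinements, the outline can be transcribed almost literally. The following Python sketch is a deliberately naive version: instead of keeping V ordered, it picks the unprocessed vertex of smallest current degree with `min`, so it does not reach the $O(m)$ bound. The name `core_numbers_naive` and the dictionary-based graph are assumptions of this sketch.

```python
def core_numbers_naive(adj):
    """Core number of every vertex, following the outline step by step.

    adj maps each vertex to the set of its neighbors (hypothetical
    representation). Selecting the minimum by scanning replaces the
    ordering of steps 1.2 and 2.2.1.2, so this runs in quadratic time.
    """
    degree = {v: len(adj[v]) for v in adj}      # step 1.1
    core = {}
    remaining = set(adj)
    while remaining:
        # Next vertex "in the order": smallest current degree.
        v = min(remaining, key=lambda x: degree[x])
        remaining.remove(v)
        core[v] = degree[v]                     # step 2.1
        for u in adj[v]:                        # step 2.2
            if degree[u] > degree[v]:           # step 2.2.1
                degree[u] -= 1                  # step 2.2.1.1
    return core


# A triangle 1-2-3, a pendant vertex 4 attached to 3, and an isolated vertex 5.
example = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}, 5: set()}
```

Already processed neighbors never pass the test in step 2.2.1, because the degrees of the processed vertices never exceed the degree of the current vertex.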

4 Detailed Algorithm 

We describe an implementation of the algorithm in a Pascal-like language.

Structure graph is used to represent a given graph G = (V, L). We will
not describe the structure in detail, because there are several ways to
implement it. We assume that the vertices of G are numbered from 1 to n.
The user also has to provide the functions size and in Neighbors, described
in the table:

name (parameters)       returned value

size(G)                 number of vertices in graph G
u in Neighbors(G, v)    u is a not yet visited neighbor of vertex v in graph G



Using an adequate representation of graph G (lists of neighbors) we can 
implement both functions to run in constant time. 

Two types of integer arrays (tableVert and tableDeg) are also
introduced. Both of them must be of length at least n. The only difference
is how we index their elements. We start with index 1 in tableVert and
with index 0 in tableDeg.

The algorithm is implemented by procedure cores. The input is graph 
G, represented by variable g of type graph, the output is array deg of type 
tableVert containing core number for each vertex of graph G. 

We need (03-06) some integer variables and three additional arrays. Array
vert contains the set of vertices, sorted by their degrees. Positions of
vertices in array vert are stored in array pos. Array bin contains for
each possible degree the position of the first vertex of that degree in array
vert. See also Figure 2. Algorithm 1 presents a Pascal implementation of our
algorithm for the case of a simple undirected graph G = (V, E), where E is
the set of edges.

Algorithm 1: The Cores Algorithm for Simple Undirected Graphs

01  procedure cores(var g: graph; var deg: tableVert);
02  var
03    n, d, md, i, start, num: integer;
04    v, u, w, du, pu, pw: integer;
05    vert, pos: tableVert;
06    bin: tableDeg;
07  begin
08    n := size(g); md := 0;
09    for v := 1 to n do begin               { degrees and maximum degree }
10      d := 0; for u in Neighbors(g, v) do inc(d);
11      deg[v] := d; if d > md then md := d;
12    end;
13    for d := 0 to md do bin[d] := 0;       { bin sizes }
14    for v := 1 to n do inc(bin[deg[v]]);
15    start := 1;
16    for d := 0 to md do begin              { starting positions of bins }
17      num := bin[d];
18      bin[d] := start;
19      inc(start, num);
20    end;
21    for v := 1 to n do begin               { put vertices into bins }
22      pos[v] := bin[deg[v]];
23      vert[pos[v]] := v;
24      inc(bin[deg[v]]);
25    end;
26    for d := md downto 1 do bin[d] := bin[d-1];   { recover bin starts }
27    bin[0] := 1;
28    for i := 1 to n do begin               { main loop }
29      v := vert[i];
30      for u in Neighbors(g, v) do begin
31        if deg[u] > deg[v] then begin
32          du := deg[u]; pu := pos[u];
33          pw := bin[du]; w := vert[pw];
34          if u <> w then begin             { swap u with first vertex of its bin }
35            pos[u] := pw; vert[pu] := w;
36            pos[w] := pu; vert[pw] := u;
37          end;
38          inc(bin[du]); dec(deg[u]);
39        end;
40      end;
41    end;
42  end;

Figure 2: Arrays

In a real implementation of the proposed algorithm dynamically allocated
arrays should be used. To simplify our description of the algorithm
we replaced them with static ones.

At the beginning we have to initialize some local variables and arrays 
(08-12). First we determine n, the number of vertices of graph g. Then we 
compute degree for each vertex v in graph g and store it into array 
deg. Simultaneously we also compute the maximum degree md. 

After that we sort the vertices in increasing order of their degrees
using bin-sort (13-25). First we count (13-14) how many vertices will be in
each bin (a bin consists of vertices with the same degree). Bins are numbered
from 0 to md.

From bin sizes we can determine (15-20) starting positions of bins in
array vert. Bin 0 starts at position 1, while each other bin starts at the
position equal to the sum of the starting position and size of the previous
bin. To avoid an additional array we used the same array (bin) to store the
starting positions of bins. Now we can put (21-25) vertices of graph G into
array vert. For each vertex we know to which bin it belongs and what is the
starting position of that bin. So we can put the vertex in the proper place,
remember its position in table pos, and increase the starting position of
the bin we used. The vertices are now sorted by their degrees.

In the final step of the initialization phase we have to recover the
starting positions of the bins (26-27). We increased them several times in
the previous step, when we put vertices into the corresponding bins. The
changed starting position of each bin is the original starting position of
the next bin. To restore the right starting positions we have to move the
values in array bin by one position to the right. We also have to reset the
starting position of bin 0 to value 1.
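
The initialization phase (lines 13-27 of Algorithm 1) can be sketched in Python as follows. This is an illustrative transcription, not the Pajek source. The function name `bin_sort_by_degree` is an assumption, and index 0 of the 1-based arrays is left unused to mirror the Pascal indexing.

```python
def bin_sort_by_degree(deg):
    """Sort vertices 1..n by degree with a bin sort.

    deg[v] is the degree of vertex v for v = 1..n; deg[0] is unused
    padding so that indices match the 1-based Pascal arrays.
    Returns (vert, pos, bin_), where bin_[d] is the position in vert
    of the first vertex of degree d.
    """
    n = len(deg) - 1
    md = max(deg[1:], default=0)
    bin_ = [0] * (md + 1)
    for v in range(1, n + 1):          # lines 13-14: bin sizes
        bin_[deg[v]] += 1
    start = 1
    for d in range(md + 1):            # lines 15-20: starting positions
        num = bin_[d]
        bin_[d] = start
        start += num
    vert = [0] * (n + 1)
    pos = [0] * (n + 1)
    for v in range(1, n + 1):          # lines 21-25: fill the bins
        pos[v] = bin_[deg[v]]
        vert[pos[v]] = v
        bin_[deg[v]] += 1
    for d in range(md, 0, -1):         # lines 26-27: recover bin starts
        bin_[d] = bin_[d - 1]
    bin_[0] = 1
    return vert, pos, bin_


# Degrees of vertices 1..5 (index 0 is padding).
vert, pos, bin_ = bin_sort_by_degree([0, 2, 2, 3, 1, 0])
```

For these degrees the vertices end up in the order 5, 4, 1, 2, 3, and the bins for degrees 0, 1, 2, 3 start at positions 1, 2, 3, 5.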

The cores decomposition, implementing the for each loop from the
algorithm described in section 3, is done in the main loop (28-41) that runs
over all vertices v of graph g in the order determined by table vert. The
core number of the current vertex v is the current degree of that vertex.
This number is already stored in table deg. For each neighbor u of vertex v
with higher degree we have to decrease its degree and move it one bin to the
left. Moving vertex u one bin to the left is an operation that can be done
in constant time. First we have to swap vertex u and the first vertex in
the same bin. In array pos we also have to swap their positions. Finally
we increase the starting position of the bin (this enlarges the previous bin
and reduces the current bin by one element).
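
For readers who prefer a runnable form, the whole of Algorithm 1 can be sketched in Python. This is a transcription under stated assumptions, not the Pajek implementation: the function name `core_numbers` and the list-of-lists input are choices of this sketch, and index 0 is unused so that the indices match the Pascal listing.

```python
def core_numbers(neighbors):
    """Core numbers of a simple undirected graph.

    neighbors[v] lists the neighbors of vertex v for v = 1..n;
    neighbors[0] is unused padding. Returns deg, where deg[v] is the
    core number of v (deg[0] is padding).
    """
    n = len(neighbors) - 1
    deg = [0] + [len(neighbors[v]) for v in range(1, n + 1)]
    md = max(deg[1:], default=0)

    # Initialization: bin sort of the vertices by degree (lines 13-27).
    bin_ = [0] * (md + 1)
    for v in range(1, n + 1):
        bin_[deg[v]] += 1
    start = 1
    for d in range(md + 1):
        bin_[d], start = start, start + bin_[d]
    vert = [0] * (n + 1)
    pos = [0] * (n + 1)
    for v in range(1, n + 1):
        pos[v] = bin_[deg[v]]
        vert[pos[v]] = v
        bin_[deg[v]] += 1
    for d in range(md, 0, -1):
        bin_[d] = bin_[d - 1]
    bin_[0] = 1

    # Main loop (lines 28-41).
    for i in range(1, n + 1):
        v = vert[i]
        for u in neighbors[v]:
            if deg[u] > deg[v]:
                du, pu = deg[u], pos[u]
                pw = bin_[du]          # first position of u's bin
                w = vert[pw]           # first vertex of u's bin
                if u != w:             # swap u and w in vert and pos
                    pos[u], vert[pu] = pw, w
                    pos[w], vert[pw] = pu, u
                bin_[du] += 1          # u now belongs to the previous bin
                deg[u] -= 1
    return deg


# Triangle 1-2-3, pendant vertex 4 attached to 3, isolated vertex 5.
small = [[], [2, 3], [1, 3], [1, 2, 4], [3], []]
# Complete graph on 1..4 with an extra vertex 5 attached to 1.
k4_plus = [[], [2, 3, 4, 5], [1, 3, 4], [1, 2, 4], [1, 2, 3], [1]]
```

The swap followed by `bin_[du] += 1` is the constant-time move of u one bin to the left: u becomes the last element of the bin for degree du - 1 without any other vertex changing bins.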

4.1 Time complexity 

We shall show that the described algorithm runs in time $O(\max(m, n))$.

To compute (08-12) the degrees of all vertices we need time $O(\max(m, n))$,
since we have to consider each line at most twice. The bin sort (13-27)
consists of five loops of size at most n with constant time $O(1)$ bodies;
therefore it runs in time $O(n)$.

The statement (29) requires constant time and therefore contributes
$O(n)$ to the algorithm. The conditional statement (31-39) also runs in
constant time. Since it is executed for each edge of G at most twice, the
contribution of (30-40) in all repetitions of (28-41) is $O(\max(m, n))$.

Summing up, the total time complexity of the algorithm is $O(\max(m, n))$.
Note that in a connected network $m \geq n - 1$ and therefore
$O(\max(m, n)) = O(m)$.
4.2 Adaption of the algorithm for directed graphs

For directed simple graphs without loops only a few changes in the
implementation of the algorithm are needed, depending on the interpretation
of the degree. In the case of in-degree and out-degree the function
in Neighbors must return the next not yet visited in-neighbor and
out-neighbor respectively. If degree is defined as in-degree + out-degree,
the maximum degree can be at most $2n - 2$. In this case we must provide
enough space for table bin ($2n - 1$ elements). Function in Neighbors must
return the next not yet visited in-neighbor or out-neighbor.
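
One way to realize the three interpretations of degree is to build a different neighbor list for each of them, so that the length of a vertex's list equals its degree under the chosen interpretation. The Python sketch below assumes arcs given as (u, v) pairs meaning u -> v. The function name `neighbor_lists` and the `mode` parameter are hypothetical, introduced only for this illustration.

```python
def neighbor_lists(n, arcs, mode="both"):
    """Neighbor lists of a directed simple graph without loops.

    Vertices are numbered 1..n and index 0 is unused. mode selects the
    interpretation of degree: "in", "out", or "both" (in-degree +
    out-degree). With "both", a pair of opposite arcs contributes two
    entries, so a list can hold up to 2n - 2 entries.
    """
    if mode not in ("in", "out", "both"):
        raise ValueError("mode must be 'in', 'out' or 'both'")
    neighbors = [[] for _ in range(n + 1)]
    for u, v in arcs:
        if mode in ("out", "both"):
            neighbors[u].append(v)     # v is an out-neighbor of u
        if mode in ("in", "both"):
            neighbors[v].append(u)     # u is an in-neighbor of v
    return neighbors


# Complete directed graph on three vertices: every ordered pair is an arc.
arcs = [(1, 2), (2, 1), (1, 3), (3, 1), (2, 3), (3, 2)]
both = neighbor_lists(3, arcs, "both")
```

In this example every vertex reaches the bound $2n - 2 = 4$, which is why table bin needs $2n - 1$ elements (degrees 0 to $2n - 2$).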

5 Example 

We applied the described algorithm for cores decomposition on a network
based on Knuth's English dictionary. This network has 52652 vertices
(English words having 2 to 8 characters) and 89038 edges (two vertices are
adjacent if we can get one word from the other by changing, removing or
inserting a letter). The obtained network is sparse: its density is
0.0000642. The program took only 0.01 sec on a PC to compute the core
numbers. In the table below the summary results are presented.

Vertices with core number 0 are isolated vertices. Vertices with core
number 1 have at least one neighbor but do not belong to the 2-core. The
25-core (main core) consists of 26 vertices, where each vertex has at least
25 neighbors inside the core (obviously this is a clique). The corresponding
words are a's, b's, c's, ..., y's, z's.

The 16-core has an additional 34 vertices (an, on, ban, bon, can, con,
Dan, don, eon, fan, gon, Han, hon, Ian, ion, Jan, Jon, man, Nan, non,
pan, pon, ran, Ron, San, son, tan, ton, van, von, wan, won, yon, Zan).
There are no edges between vertices with core number 25 and vertices with
core number 16. The adjacency matrix of the subgraph induced by these 34
vertices is presented in Figure 3. In this matrix we can see two 17-cliques
and some additional edges.

The 15-core has an additional 16 vertices (ow, bow, cow, Dow, how, jow,
low, mow, now, pow, row, sow, tow, vow, wow, yow). This is a clique 
again, because only the first letters of the words are different. 
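
The adjacency rule of the dictionary network, and the density quoted above, can be checked with a short Python sketch. The helper names `adjacent` and `density` are assumptions of this illustration. The density formula used is the usual one for simple undirected graphs, the number of edges divided by the number of vertex pairs.

```python
def adjacent(a, b):
    """True if word b is obtained from word a by changing, removing or
    inserting exactly one letter (comparison is case-sensitive)."""
    if a == b:
        return False
    if len(a) == len(b):
        # Same length: exactly one position may differ.
        return sum(x != y for x, y in zip(a, b)) == 1
    if abs(len(a) - len(b)) != 1:
        return False
    short, long_ = (a, b) if len(a) < len(b) else (b, a)
    # Removing one letter from the longer word must give the shorter one.
    return any(long_[:i] + long_[i + 1:] == short for i in range(len(long_)))


def density(n, m):
    """Density of a simple undirected graph with n vertices and m edges."""
    return 2 * m / (n * (n - 1))


# The words ow, bow, cow, how differ pairwise by one changed or
# inserted letter, so they are pairwise adjacent.
words = ["ow", "bow", "cow", "how"]
is_clique = all(adjacent(a, b) for a in words for b in words if a != b)
```

With n = 52652 and m = 89038, `density` gives a value that rounds to the 0.0000642 reported for the network.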
6 Conclusion

The cores, because they can be efficiently determined, are one of the few
concepts that provide us with meaningful decompositions of large networks.
We expect that different approaches to the analysis of large networks can
be built on this basis. For example, the sequence of vertices in sequential
coloring can be determined by their core numbers (combined with their
degrees). Cores can also be used to reveal interesting subnetworks in large
networks.

The described algorithm is implemented in Pajek (the Slovene word for
spider), a program for the analysis of large networks for Windows (32 bit).
It is freely available, for noncommercial use, at its homepage:

http://vlado.fmf.uni-lj.si/pub/networks/pajek/
Figure 3: Adjacency matrix of 16-core without 25-core